14 research outputs found
Explaining Recurrent Neural Network Predictions in Sentiment Analysis
Recently, a technique called Layer-wise Relevance Propagation (LRP) was shown
to deliver insightful explanations in the form of input space relevances for
understanding feed-forward neural network classification decisions. In the
present work, we extend the usage of LRP to recurrent neural networks. We
propose a specific propagation rule applicable to multiplicative connections as
they arise in recurrent network architectures such as LSTMs and GRUs. We apply
our technique to a word-based bi-directional LSTM model on a five-class
sentiment prediction task, and evaluate the resulting LRP relevances both
qualitatively and quantitatively, obtaining better results than a related
gradient-based method used in previous work.
Comment: 9 pages, 4 figures, accepted for the EMNLP'17 Workshop on
Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA)
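The propagation rule for multiplicative connections can be illustrated with a short sketch. This is a hedged illustration, assuming the common "signal-take-all" convention (gates receive zero relevance, the signal inherits everything) together with a standard epsilon-stabilized LRP rule for linear layers; function names and shapes are illustrative, not the authors' exact implementation.

```python
import numpy as np

def lrp_linear(x, w, b, R_out, eps=1e-3):
    # Epsilon-stabilized LRP for a linear layer z = x @ w + b:
    # each input i receives relevance in proportion to its
    # contribution x_i * w[i, k] to the pre-activation z_k.
    z = x @ w + b
    s = R_out / (z + eps * np.sign(z))  # stabilized ratio R_k / z_k
    return x * (s @ w.T)                # redistribute onto the inputs

def lrp_multiplicative(R_out, gate, signal):
    # Sketch of a rule for products gate * signal, as they occur in
    # LSTM/GRU cells: the gate is treated as a switch and receives
    # zero relevance, while the signal inherits the full relevance.
    return np.zeros_like(gate), R_out.copy()
```

With zero biases and a small epsilon, the linear rule approximately conserves relevance: the relevance entering a layer roughly equals the relevance redistributed onto its inputs.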
Ground Truth Evaluation of Neural Network Explanations with CLEVR-XAI
The rise of deep learning in today's applications entailed an increasing need
in explaining the model's decisions beyond prediction performances in order to
foster trust and accountability. Recently, the field of explainable AI (XAI)
has developed methods that provide such explanations for already trained neural
networks. In computer vision tasks such explanations, termed heatmaps,
visualize the contributions of individual pixels to the prediction. So far XAI
methods along with their heatmaps were mainly validated qualitatively via
human-based assessment, or evaluated through auxiliary proxy tasks such as
pixel perturbation, weak object localization or randomization tests. Due to the
lack of an objective and commonly accepted quality measure for heatmaps, it was
debatable which XAI method performs best and whether explanations can be
trusted at all. In the present work, we tackle the problem by proposing a
ground truth based evaluation framework for XAI methods based on the CLEVR
visual question answering task. Our framework provides a (1) selective, (2)
controlled and (3) realistic testbed for the evaluation of neural network
explanations. We compare ten different explanation methods, resulting in new
insights about the quality and properties of XAI methods, sometimes
contradicting with conclusions from previous comparative studies. The CLEVR-XAI
dataset and the benchmarking code can be found at
https://github.com/ahmedmagdiosman/clevr-xai.
Comment: 37 pages, 9 tables, 2 figures (plus appendix 14 pages)
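To illustrate what a ground-truth based heatmap evaluation can look like, the sketch below computes the share of positive relevance that falls inside a ground-truth object mask. This is a simplified, assumed variant of such a measure, not necessarily the framework's exact metric.

```python
import numpy as np

def relevance_mass_accuracy(heatmap, mask):
    # Fraction of the total positive relevance in the heatmap that
    # lies inside the ground-truth mask (1 = all relevance on the
    # relevant object, 0 = none of it).
    pos = np.clip(heatmap, 0, None)
    total = pos.sum()
    if total == 0:
        return 0.0
    return float(pos[mask.astype(bool)].sum() / total)
```

A metric of this kind is "selective" in the sense that it scores each heatmap against the pixels that are known, by construction of the question, to be the ones the model should attend to.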
Explainable sequence-to-sequence GRU neural network for pollution forecasting
The goal of pollution forecasting models is to enable the prediction and
control of air quality. Non-linear data-driven approaches based on deep neural
networks have been increasingly used in such contexts, showing significant
improvements over more conventional approaches such as regression models and
mechanistic approaches. While such deep learning models were long regarded as
black boxes, recent advances in eXplainable AI (XAI) allow one to look into the
model's decision-making process, providing insights into the decisive input
features responsible for the model's prediction. One XAI technique for
explaining the predictions of neural networks, which has proven useful in
various domains, is Layer-wise Relevance Propagation (LRP). In this work, we
extend the LRP technique to a sequence-to-sequence neural network model with
GRU layers. The explanation heatmaps provided by LRP allow us to identify
important meteorological and temporal features responsible for the accumulation
of four major pollutants in the air (PM10, NO2, NO, O3), and our findings can
be backed up with prior knowledge in environmental and pollution research. This
illustrates the appropriateness of XAI for understanding pollution forecasts
and opens up new avenues for controlling and mitigating the pollutants' load in
the air.
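A typical post-processing step for such heatmaps is to aggregate the relevance of each input feature over the time axis in order to rank the meteorological and temporal features. The sketch below illustrates this idea under assumed array shapes; it is not the paper's exact procedure.

```python
import numpy as np

def feature_importance(R, feature_names):
    # R: LRP relevance array of shape (time_steps, n_features).
    # Summing relevance over time yields one importance score per
    # input feature; features are returned in descending order.
    scores = R.sum(axis=0)
    order = np.argsort(scores)[::-1]
    return [(feature_names[i], float(scores[i])) for i in order]
```

The feature names here (e.g. wind speed, temperature) are hypothetical placeholders for whichever meteorological inputs the model consumes.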
Test set performance of the ML models for 20-class document classification.
Results averaged over 10 random data splits.
For each semantic extraction method, we report the dimensionality of the
document summary vectors, the explanatory power index (EPI) corresponding to
the maximum mean KNN accuracy obtained when varying the number of neighbors K,
the corresponding standard deviation over the multiple data splits, and the
hyperparameter K that led to the maximum accuracy.
KNN accuracy when classifying the document summary vectors.
The accuracy is computed on one half of the 20Newsgroups test documents (the
other half is used as neighbors). Results are averaged over 10 random data
splits.
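The KNN evaluation described in this caption can be sketched as follows, assuming cosine similarity as the similarity measure, which is a common choice for document vectors but an assumption here.

```python
import numpy as np

def knn_accuracy(pool_X, pool_y, query_X, query_y, K=5):
    # One half of the test documents serves as the neighbor pool
    # (pool_X, pool_y); the other half (query_X, query_y) is classified
    # by majority vote over its K most cosine-similar neighbors.
    pn = pool_X / np.linalg.norm(pool_X, axis=1, keepdims=True)
    qn = query_X / np.linalg.norm(query_X, axis=1, keepdims=True)
    sims = qn @ pn.T
    correct = 0
    for i in range(len(query_y)):
        nearest = np.argsort(sims[i])[::-1][:K]   # top-K neighbors
        votes = np.bincount(pool_y[nearest])      # majority vote
        correct += int(votes.argmax() == query_y[i])
    return correct / len(query_y)
```

Varying K and taking the maximum mean accuracy over the data splits yields the explanatory power index reported in the table.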
LRP heatmaps of the document sci.space 61393 for the CNN2 and SVM model.
Positive relevance is mapped to red, negative to blue. The color opacity is
normalized to the maximum absolute relevance per document. The LRP target class
and the corresponding classification prediction score are indicated on the
left.